Overview
Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 3407 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 532.4 KiB |
| Average record size in memory | 160.0 B |
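As a quick consistency check, the average record size follows from the two figures above it: total memory divided by the number of observations. A minimal sketch (the variable names are illustrative and not part of the report):

```python
# Figures copied from the Dataset statistics table.
total_kib = 532.4        # Total size in memory, in KiB
n_observations = 3407    # Number of observations

# 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B per record")
```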
Variable types
| Numeric | 5 |
|---|---|
| Categorical | 7 |
| DateTime | 1 |
Alerts
| Alert | Type |
|---|---|
| STATE has a high cardinality: 52 distinct values | High cardinality |
| HCP_SPECIALTY has a high cardinality: 78 distinct values | High cardinality |
| HCP_GENDER is highly overall correlated with PATIENT_ID | High correlation |
| HCP_SPECIALTY is highly overall correlated with PATIENT_ID | High correlation |
| INSURANCE_TYPE is highly overall correlated with PATIENT_ID | High correlation |
| NUM_CONDITIONS is highly overall correlated with PATIENT_AGE_DIAGNOSED | High correlation |
| PATIENT_AGE_DIAGNOSED is highly overall correlated with NUM_CONDITIONS | High correlation |
| PATIENT_GENDER is highly overall correlated with PATIENT_ID | High correlation |
| PATIENT_ID is highly overall correlated with HCP_GENDER and 6 other fields | High correlation |
| STATE is highly overall correlated with PATIENT_ID | High correlation |
| TARGET is highly overall correlated with PATIENT_ID | High correlation |
| TXN_LOCATION_TYPE is highly overall correlated with PATIENT_ID | High correlation |
| INSURANCE_TYPE is highly imbalanced (58.0%) | Imbalance |
| PATIENT_ID is uniformly distributed | Uniform |
| PATIENT_ID has unique values | Unique |
| NUM_CONTRAINDICATIONS has 2434 (71.4%) zeros | Zeros |
| PATIENT_AGE_DIAGNOSED has 38 (1.1%) zeros | Zeros |
Reproduction
| Analysis started | 2024-12-17 00:16:38.017140 |
|---|---|
| Analysis finished | 2024-12-17 00:17:52.784577 |
| Duration | 1 minute and 14.77 seconds |
| Software version | ydata-profiling v4.12.1 |
| Download configuration | config.json |
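A report like this one can be regenerated along the following lines. This is a minimal sketch, not the exact invocation used here: the CSV path, the output path, and the report title are hypothetical, and the settings actually used are in the config.json referenced above, which is not reproduced in this text. `ProfileReport` and `to_file` are the standard ydata-profiling entry points.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file with ydata-profiling and write an HTML report."""
    # Imported lazily so this module loads even when the third-party
    # packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Parse TXN_DT as a date so it is typed as DateTime, as in this report.
    df = pd.read_csv(csv_path, parse_dates=["TXN_DT"])
    profile = ProfileReport(df, title="Dataset profile")  # title is a placeholder
    profile.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("transactions.csv")` (a hypothetical file name) would write `report.html` next to the script.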
Variables
PATIENT_ID
Real number (ℝ)
High correlation  Uniform  Unique 
| Distinct | 3407 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2003.7831 |
| Minimum | 1 |
|---|---|
| Maximum | 4020 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 26.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 199.3 |
| Q1 | 1000 |
| median | 1997 |
| Q3 | 3006.5 |
| 95-th percentile | 3821.7 |
| Maximum | 4020 |
| Range | 4019 |
| Interquartile range (IQR) | 2006.5 |
Descriptive statistics
| Standard deviation | 1160.9316 |
|---|---|
| Coefficient of variation (CV) | 0.5793699 |
| Kurtosis | -1.1992073 |
| Mean | 2003.7831 |
| Median Absolute Deviation (MAD) | 1003 |
| Skewness | 0.0067949401 |
| Sum | 6826889 |
| Variance | 1347762.2 |
| Monotonicity | Not monotonic |
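Several of the PATIENT_ID figures above are derived from one another, which makes them easy to sanity-check. The sketch below copies the reported values and verifies the standard relationships between them; the tolerance is an arbitrary choice that allows for the rounding in the report.

```python
import math

# Values copied from the PATIENT_ID tables above.
n, total = 3407, 6826889
mean, std, variance = 2003.7831, 1160.9316, 1347762.2
q1, q3, minimum, maximum = 1000, 3006.5, 1, 4020

checks = {
    "mean = sum / n": math.isclose(total / n, mean, rel_tol=1e-6),
    "variance = std^2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, 0.5793699, rel_tol=1e-6),
    "IQR = Q3 - Q1": q3 - q1 == 2006.5,
    "range = max - min": maximum - minimum == 4019,
}
for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```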
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 1889 | 1 | < 0.1% |
| 1892 | 1 | < 0.1% |
| 1893 | 1 | < 0.1% |
| 1894 | 1 | < 0.1% |
| 1895 | 1 | < 0.1% |
| 1896 | 1 | < 0.1% |
| 1897 | 1 | < 0.1% |
| 1898 | 1 | < 0.1% |
| 1899 | 1 | < 0.1% |
| Other values (3397) | 3397 | 99.7% |
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| 11 | 1 | < 0.1% |
| 12 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 4020 | 1 | < 0.1% |
| 4019 | 1 | < 0.1% |
| 4017 | 1 | < 0.1% |
| 4016 | 1 | < 0.1% |
| 4015 | 1 | < 0.1% |
| 4012 | 1 | < 0.1% |
| 4010 | 1 | < 0.1% |
| 4009 | 1 | < 0.1% |
| 4008 | 1 | < 0.1% |
| 4007 | 1 | < 0.1% |
PATIENT_GENDER
Categorical
High correlation 
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 KiB |
| F-Female | 1942 |
|---|---|
| M-Male | 1465 |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 7.1400059 |
| Min length | 6 |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M-Male |
|---|---|
| 2nd row | M-Male |
| 3rd row | M-Male |
| 4th row | M-Male |
| 5th row | M-Male |
Common Values
| Value | Count | Frequency (%) |
| F-Female | 1942 | 57.0% |
| M-Male | 1465 | 43.0% |
Words
| Value | Count | Frequency (%) |
| f-female | 1942 | 57.0% |
| m-male | 1465 | 43.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 5349 | 22.0% |
| F | 3884 | 16.0% |
| - | 3407 | 14.0% |
| a | 3407 | 14.0% |
| l | 3407 | 14.0% |
| M | 2930 | 12.0% |
| m | 1942 | 8.0% |
NUM_CONDITIONS
Real number (ℝ)
High correlation 
| Distinct | 159 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.632228 |
| Minimum | 1 |
|---|---|
| Maximum | 189 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 30.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 5 |
| Q3 | 22 |
| 95-th percentile | 85.7 |
| Maximum | 189 |
| Range | 188 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 29.761452 |
|---|---|
| Coefficient of variation (CV) | 1.5973105 |
| Kurtosis | 7.2789425 |
| Mean | 18.632228 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 2.600783 |
| Sum | 63480 |
| Variance | 885.74403 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 841 | 24.7% |
| 2 | 363 | 10.7% |
| 3 | 221 | 6.5% |
| 4 | 160 | 4.7% |
| 5 | 122 | 3.6% |
| 6 | 95 | 2.8% |
| 7 | 89 | 2.6% |
| 8 | 84 | 2.5% |
| 10 | 77 | 2.3% |
| 9 | 63 | 1.8% |
| Other values (149) | 1292 | 37.9% |
| Value | Count | Frequency (%) |
| 1 | 841 | 24.7% |
| 2 | 363 | 10.7% |
| 3 | 221 | 6.5% |
| 4 | 160 | 4.7% |
| 5 | 122 | 3.6% |
| 6 | 95 | 2.8% |
| 7 | 89 | 2.6% |
| 8 | 84 | 2.5% |
| 9 | 63 | 1.8% |
| 10 | 77 | 2.3% |
| Value | Count | Frequency (%) |
| 189 | 1 | < 0.1% |
| 186 | 1 | < 0.1% |
| 180 | 1 | < 0.1% |
| 179 | 1 | < 0.1% |
| 177 | 1 | < 0.1% |
| 175 | 1 | < 0.1% |
| 174 | 1 | < 0.1% |
| 171 | 1 | < 0.1% |
| 170 | 2 | 0.1% |
| 165 | 2 | 0.1% |
NUM_CONTRAINDICATIONS
Real number (ℝ)
Zeros 
| Distinct | 79 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.8062812 |
| Minimum | 0 |
|---|---|
| Maximum | 360 |
| Zeros | 2434 |
| Zeros (%) | 71.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 30.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 24 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 12.525443 |
|---|---|
| Coefficient of variation (CV) | 3.2907297 |
| Kurtosis | 213.12078 |
| Mean | 3.8062812 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 10.035368 |
| Sum | 12968 |
| Variance | 156.88671 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2434 | 71.4% |
| 1 | 228 | 6.7% |
| 2 | 117 | 3.4% |
| 3 | 72 | 2.1% |
| 4 | 42 | 1.2% |
| 5 | 37 | 1.1% |
| 6 | 27 | 0.8% |
| 8 | 26 | 0.8% |
| 7 | 23 | 0.7% |
| 11 | 19 | 0.6% |
| Other values (69) | 382 | 11.2% |
| Value | Count | Frequency (%) |
| 0 | 2434 | 71.4% |
| 1 | 228 | 6.7% |
| 2 | 117 | 3.4% |
| 3 | 72 | 2.1% |
| 4 | 42 | 1.2% |
| 5 | 37 | 1.1% |
| 6 | 27 | 0.8% |
| 7 | 23 | 0.7% |
| 8 | 26 | 0.8% |
| 9 | 19 | 0.6% |
| Value | Count | Frequency (%) |
| 360 | 1 | < 0.1% |
| 184 | 1 | < 0.1% |
| 99 | 1 | < 0.1% |
| 97 | 1 | < 0.1% |
| 86 | 1 | < 0.1% |
| 81 | 1 | < 0.1% |
| 79 | 2 | 0.1% |
| 77 | 1 | < 0.1% |
| 76 | 1 | < 0.1% |
| 74 | 1 | < 0.1% |
TXN_DT
Date
| Distinct | 72 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 26.7 KiB |
| Minimum | 2022-04-01 00:00:00 |
|---|---|
| Maximum | 2022-06-30 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
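To put the 72 distinct dates in context, it helps to compare them with the number of calendar days between the minimum and maximum. The sketch below does that arithmetic with the standard library; it only uses the figures from the table above.

```python
from datetime import date

first, last = date(2022, 4, 1), date(2022, 6, 30)
distinct_dates = 72  # "Distinct" from the TXN_DT table

# Inclusive number of calendar days between Minimum and Maximum.
span_days = (last - first).days + 1
coverage = distinct_dates / span_days
print(f"{distinct_dates} of {span_days} days have at least one record ({coverage:.1%})")
```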
HCP_ID
Real number (ℝ)
| Distinct | 3283 |
|---|---|
| Distinct (%) | 96.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11103.371 |
| Minimum | 2 |
|---|---|
| Maximum | 25342 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 26.7 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 553.3 |
| Q1 | 4041.5 |
| median | 10895 |
| Q3 | 17573 |
| 95-th percentile | 23491.4 |
| Maximum | 25342 |
| Range | 25340 |
| Interquartile range (IQR) | 13531.5 |
Descriptive statistics
| Standard deviation | 7458.8617 |
|---|---|
| Coefficient of variation (CV) | 0.67176549 |
| Kurtosis | -1.223735 |
| Mean | 11103.371 |
| Median Absolute Deviation (MAD) | 6740 |
| Skewness | 0.1424455 |
| Sum | 37829186 |
| Variance | 55634618 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10723 | 6 | 0.2% |
| 18825 | 4 | 0.1% |
| 24319 | 4 | 0.1% |
| 24126 | 4 | 0.1% |
| 13216 | 3 | 0.1% |
| 9144 | 3 | 0.1% |
| 18851 | 3 | 0.1% |
| 22069 | 3 | 0.1% |
| 2112 | 3 | 0.1% |
| 8971 | 3 | 0.1% |
| Other values (3273) | 3371 | 98.9% |
| Value | Count | Frequency (%) |
| 2 | 1 | < 0.1% |
| 4 | 2 | 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 11 | 1 | < 0.1% |
| 13 | 1 | < 0.1% |
| 14 | 1 | < 0.1% |
| 19 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 25342 | 1 | < 0.1% |
| 25320 | 1 | < 0.1% |
| 25185 | 1 | < 0.1% |
| 25180 | 1 | < 0.1% |
| 25171 | 2 | 0.1% |
| 25158 | 1 | < 0.1% |
| 25155 | 1 | < 0.1% |
| 25149 | 1 | < 0.1% |
| 25133 | 1 | < 0.1% |
| 25122 | 1 | < 0.1% |
TXN_LOCATION_TYPE
Categorical
High correlation 
| Distinct | 26 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.5 KiB |
| OFFICE | 1055 |
|---|---|
| EMERGENCY ROOM - HOSPITAL | 555 |
| URGENT CARE FACILITY | 541 |
| TELEHEALTH PROVIDED OTHER THAN IN PATIENT'S HOME | 380 |
| HOSPITAL OUTPATIENT | 337 |
| Other values (21) | 539 |
Length
| Max length | 55 |
|---|---|
| Median length | 48 |
| Mean length | 20.942471 |
| Min length | 6 |
Unique
| Unique | 5 ? |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | EMERGENCY ROOM - HOSPITAL |
|---|---|
| 2nd row | EMERGENCY ROOM - HOSPITAL |
| 3rd row | OFFICE |
| 4th row | EMERGENCY ROOM - HOSPITAL |
| 5th row | INPATIENT HOSPITAL |
Common Values
| Value | Count | Frequency (%) |
| OFFICE | 1055 | 31.0% |
| EMERGENCY ROOM - HOSPITAL | 555 | 16.3% |
| URGENT CARE FACILITY | 541 | 15.9% |
| TELEHEALTH PROVIDED OTHER THAN IN PATIENT'S HOME | 380 | 11.2% |
| HOSPITAL OUTPATIENT | 337 | 9.9% |
| TELEHEALTH PROVIDED IN PATIENT'S HOME | 125 | 3.7% |
| INPATIENT HOSPITAL | 83 | 2.4% |
| UNASSIGNED | 55 | 1.6% |
| ON CAMPUS-OUTPATIENT HOSPITAL | 54 | 1.6% |
| HOSPITAL INPATIENT (INCLUDING MEDICARE PART A) | 40 | 1.2% |
| Other values (16) | 182 | 5.3% |
Words
| Value | Count | Frequency (%) |
| hospital | 1127 | 11.0% |
| office | 1055 | 10.3% |
| | 621 | 6.1% |
| room | 555 | 5.4% |
| emergency | 555 | 5.4% |
| facility | 543 | 5.3% |
| urgent | 541 | 5.3% |
| care | 541 | 5.3% |
| provided | 530 | 5.2% |
| in | 505 | 4.9% |
| Other values (43) | 3685 | 35.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 8410 | 11.8% |
| (space) | 6866 | 9.6% |
| T | 6797 | 9.5% |
| I | 5990 | 8.4% |
| O | 5381 | 7.5% |
| A | 4726 | 6.6% |
| N | 3613 | 5.1% |
| H | 3590 | 5.0% |
| R | 3495 | 4.9% |
| C | 3175 | 4.4% |
| Other values (18) | 19308 | 27.1% |
INSURANCE_TYPE
Categorical
High correlation  Imbalance 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.9 KiB |
| COMMERCIAL | 2716 |
|---|---|
| MEDICARE | 612 |
| MEDICAID | 74 |
| UNSPECIFIED | 5 |
Length
| Max length | 11 |
|---|---|
| Median length | 10 |
| Mean length | 9.5987672 |
| Min length | 8 |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | COMMERCIAL |
|---|---|
| 2nd row | COMMERCIAL |
| 3rd row | COMMERCIAL |
| 4th row | COMMERCIAL |
| 5th row | COMMERCIAL |
Common Values
| Value | Count | Frequency (%) |
| COMMERCIAL | 2716 | 79.7% |
| MEDICARE | 612 | 18.0% |
| MEDICAID | 74 | 2.2% |
| UNSPECIFIED | 5 | 0.1% |
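The 58.0% imbalance figure reported for INSURANCE_TYPE is consistent with one minus the normalized Shannon entropy of the category counts. The sketch below recomputes it from the counts in the table above; that the profiler uses exactly this definition is an assumption here, and `imbalance_score` is an illustrative helper rather than a library function.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    if len(counts) < 2:
        return 0.0  # a single class has no balance to measure
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    max_entropy = math.log(len(counts))  # entropy of a uniform distribution
    return 1 - entropy / max_entropy


# COMMERCIAL, MEDICARE, MEDICAID, UNSPECIFIED
score = imbalance_score([2716, 612, 74, 5])
print(f"{score:.1%}")
```

The score is driven mostly by COMMERCIAL holding roughly four fifths of the rows while UNSPECIFIED has only five.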
Words
| Value | Count | Frequency (%) |
| commercial | 2716 | 79.7% |
| medicare | 612 | 18.0% |
| medicaid | 74 | 2.2% |
| unspecified | 5 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 6123 | 18.7% |
| M | 6118 | 18.7% |
| E | 4024 | 12.3% |
| I | 3486 | 10.7% |
| A | 3402 | 10.4% |
| R | 3328 | 10.2% |
| O | 2716 | 8.3% |
| L | 2716 | 8.3% |
| D | 765 | 2.3% |
| U | 5 | < 0.1% |
| Other values (4) | 20 | 0.1% |
PATIENT_AGE_DIAGNOSED
Real number (ℝ)
High correlation  Zeros 
| Distinct | 86 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 47.144409 |
| Minimum | 0 |
|---|---|
| Maximum | 85 |
| Zeros | 38 |
| Zeros (%) | 1.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 26.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 29 |
| median | 50 |
| Q3 | 67 |
| 95-th percentile | 84 |
| Maximum | 85 |
| Range | 85 |
| Interquartile range (IQR) | 38 |
Descriptive statistics
| Standard deviation | 24.145581 |
|---|---|
| Coefficient of variation (CV) | 0.51216214 |
| Kurtosis | -0.93776316 |
| Mean | 47.144409 |
| Median Absolute Deviation (MAD) | 19 |
| Skewness | -0.2823762 |
| Sum | 160621 |
| Variance | 583.00909 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 85 | 162 | 4.8% |
| 1 | 79 | 2.3% |
| 36 | 60 | 1.8% |
| 62 | 59 | 1.7% |
| 51 | 58 | 1.7% |
| 52 | 56 | 1.6% |
| 65 | 55 | 1.6% |
| 69 | 54 | 1.6% |
| 59 | 54 | 1.6% |
| 72 | 54 | 1.6% |
| Other values (76) | 2716 | 79.7% |
| Value | Count | Frequency (%) |
| 0 | 38 | 1.1% |
| 1 | 79 | 2.3% |
| 2 | 41 | 1.2% |
| 3 | 30 | 0.9% |
| 4 | 24 | 0.7% |
| 5 | 25 | 0.7% |
| 6 | 32 | 0.9% |
| 7 | 19 | 0.6% |
| 8 | 19 | 0.6% |
| 9 | 20 | 0.6% |
| Value | Count | Frequency (%) |
| 85 | 162 | 4.8% |
| 84 | 19 | 0.6% |
| 83 | 25 | 0.7% |
| 82 | 21 | 0.6% |
| 81 | 33 | 1.0% |
| 80 | 28 | 0.8% |
| 79 | 42 | 1.2% |
| 78 | 30 | 0.9% |
| 77 | 45 | 1.3% |
| 76 | 36 | 1.1% |
STATE
Categorical
High cardinality  High correlation 
| Distinct | 52 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.1 KiB |
| CA | 460 |
|---|---|
| TX | 335 |
| FL | 301 |
| NY | 228 |
| MI | 125 |
| Other values (47) | 1958 |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Unique
| Unique | 1 ? |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | TX |
|---|---|
| 2nd row | PA |
| 3rd row | MS |
| 4th row | PA |
| 5th row | CA |
Common Values
| Value | Count | Frequency (%) |
| CA | 460 | 13.5% |
| TX | 335 | 9.8% |
| FL | 301 | 8.8% |
| NY | 228 | 6.7% |
| MI | 125 | 3.7% |
| MD | 108 | 3.2% |
| IL | 105 | 3.1% |
| PA | 102 | 3.0% |
| GA | 101 | 3.0% |
| OH | 96 | 2.8% |
| Other values (42) | 1446 | 42.4% |
Words
| Value | Count | Frequency (%) |
| ca | 460 | 13.5% |
| tx | 335 | 9.8% |
| fl | 301 | 8.8% |
| ny | 228 | 6.7% |
| mi | 125 | 3.7% |
| md | 108 | 3.2% |
| il | 105 | 3.1% |
| pa | 102 | 3.0% |
| ga | 101 | 3.0% |
| oh | 96 | 2.8% |
| Other values (42) | 1446 | 42.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 1124 | 16.5% |
| C | 727 | 10.7% |
| N | 646 | 9.5% |
| L | 523 | 7.7% |
| T | 477 | 7.0% |
| M | 464 | 6.8% |
| I | 369 | 5.4% |
| X | 335 | 4.9% |
| F | 301 | 4.4% |
| Y | 286 | 4.2% |
| Other values (14) | 1562 | 22.9% |
HCP_SPECIALTY
Categorical
High cardinality  High correlation 
| Distinct | 78 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 11.0 KiB |
| FAMILY MEDICINE | 777 |
|---|---|
| NURSE PRACTITIONER | 551 |
| EMERGENCY MEDICINE | 540 |
| INTERNAL MEDICINE | 478 |
| PHYSICIAN ASSISTANT | 400 |
| Other values (73) | 661 |
Length
| Max length | 50 |
|---|---|
| Median length | 43 |
| Mean length | 17.525976 |
| Min length | 7 |
Unique
| Unique | 25 ? |
|---|---|
| Unique (%) | 0.7% |
Sample
| 1st row | FAMILY MEDICINE |
|---|---|
| 2nd row | EMERGENCY MEDICINE |
| 3rd row | EMERGENCY MEDICINE |
| 4th row | PEDIATRICS |
| 5th row | PEDIATRICS |
Common Values
| Value | Count | Frequency (%) |
| FAMILY MEDICINE | 777 | 22.8% |
| NURSE PRACTITIONER | 551 | 16.2% |
| EMERGENCY MEDICINE | 540 | 15.8% |
| INTERNAL MEDICINE | 478 | 14.0% |
| PHYSICIAN ASSISTANT | 400 | 11.7% |
| PEDIATRICS | 207 | 6.1% |
| ANATOMIC/CLINICAL PATHOLOGY | 68 | 2.0% |
| DIAGNOSTIC RADIOLOGY | 44 | 1.3% |
| INTERNAL MEDICINE/PEDIATRICS | 29 | 0.9% |
| OBSTETRICS & GYNECOLOGY | 27 | 0.8% |
| Other values (68) | 286 | 8.4% |
Words
| Value | Count | Frequency (%) |
| medicine | 1908 | 28.3% |
| family | 785 | 11.6% |
| emergency | 573 | 8.5% |
| nurse | 552 | 8.2% |
| practitioner | 551 | 8.2% |
| internal | 524 | 7.8% |
| physician | 400 | 5.9% |
| assistant | 400 | 5.9% |
| pediatrics | 232 | 3.4% |
| pathology | 79 | 1.2% |
| Other values (83) | 749 | 11.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| I | 8795 | 14.7% |
| E | 7981 | 13.4% |
| N | 5845 | 9.8% |
| C | 4302 | 7.2% |
| A | 4121 | 6.9% |
| M | 3444 | 5.8% |
| R | 3435 | 5.8% |
| (space) | 3346 | 5.6% |
| T | 3196 | 5.4% |
| S | 2761 | 4.6% |
| Other values (19) | 12485 | 20.9% |
HCP_GENDER
Categorical
High correlation 
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 KiB |
| M-Male | 1738 |
|---|---|
| F-Female | 1328 |
| U-Unknown | 341 |
Length
| Max length | 9 |
|---|---|
| Median length | 6 |
| Mean length | 7.0798356 |
| Min length | 6 |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M-Male |
|---|---|
| 2nd row | M-Male |
| 3rd row | F-Female |
| 4th row | F-Female |
| 5th row | F-Female |
Common Values
| Value | Count | Frequency (%) |
| M-Male | 1738 | 51.0% |
| F-Female | 1328 | 39.0% |
| U-Unknown | 341 | 10.0% |
Words
| Value | Count | Frequency (%) |
| m-male | 1738 | 51.0% |
| f-female | 1328 | 39.0% |
| u-unknown | 341 | 10.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 4394 | 18.2% |
| M | 3476 | 14.4% |
| - | 3407 | 14.1% |
| a | 3066 | 12.7% |
| l | 3066 | 12.7% |
| F | 2656 | 11.0% |
| m | 1328 | 5.5% |
| n | 1023 | 4.2% |
| U | 682 | 2.8% |
| k | 341 | 1.4% |
| Other values (2) | 682 | 2.8% |
TARGET
Categorical
High correlation 
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 166.5 KiB |
| 0 | 2821 |
|---|---|
| 1 | 586 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 2821 | 82.8% |
| 1 | 586 | 17.2% |
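The Frequency (%) column in these tables is each count divided by the 3407 observations, rounded to one decimal place. A minimal sketch using the TARGET counts (`frequency_table` is an illustrative helper, not part of the profiler):

```python
def frequency_table(counts):
    """Map each value to its share of the total, as a percentage with 1 decimal."""
    total = sum(counts.values())
    return {value: round(100 * count / total, 1) for value, count in counts.items()}


# Counts copied from the TARGET "Common Values" table.
target_counts = {"0": 2821, "1": 586}
freqs = frequency_table(target_counts)
for value, pct in freqs.items():
    print(f"TARGET={value}: {pct}%")
```

The positive class makes up about one sixth of the rows, which is worth keeping in mind when choosing evaluation metrics for a model trained on TARGET.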
Words
| Value | Count | Frequency (%) |
| 0 | 2821 | 82.8% |
| 1 | 586 | 17.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2821 | 82.8% |
| 1 | 586 | 17.2% |
Interactions
Correlations
| | HCP_GENDER | HCP_ID | HCP_SPECIALTY | INSURANCE_TYPE | NUM_CONDITIONS | NUM_CONTRAINDICATIONS | PATIENT_AGE_DIAGNOSED | PATIENT_GENDER | PATIENT_ID | STATE | TARGET | TXN_LOCATION_TYPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HCP_GENDER | 1.000 | 0.191 | 0.361 | 0.000 | 0.031 | 0.007 | 0.031 | 0.053 | 1.000 | 0.110 | 0.036 | 0.170 |
| HCP_ID | 0.191 | 1.000 | 0.193 | 0.086 | 0.076 | 0.039 | 0.036 | 0.058 | 0.031 | 0.192 | 0.092 | 0.150 |
| HCP_SPECIALTY | 0.361 | 0.193 | 1.000 | 0.232 | 0.135 | 0.000 | 0.215 | 0.106 | 1.000 | 0.079 | 0.181 | 0.321 |
| INSURANCE_TYPE | 0.000 | 0.086 | 0.232 | 1.000 | 0.183 | 0.047 | 0.358 | 0.000 | 1.000 | 0.141 | 0.079 | 0.269 |
| NUM_CONDITIONS | 0.031 | 0.076 | 0.135 | 0.183 | 1.000 | 0.423 | 0.612 | 0.000 | 0.331 | 0.050 | 0.127 | 0.084 |
| NUM_CONTRAINDICATIONS | 0.007 | 0.039 | 0.000 | 0.047 | 0.423 | 1.000 | 0.410 | 0.034 | 0.272 | 0.000 | 0.012 | 0.069 |
| PATIENT_AGE_DIAGNOSED | 0.031 | 0.036 | 0.215 | 0.358 | 0.612 | 0.410 | 1.000 | 0.088 | 0.374 | 0.048 | 0.241 | 0.120 |
| PATIENT_GENDER | 0.053 | 0.058 | 0.106 | 0.000 | 0.000 | 0.034 | 0.088 | 1.000 | 1.000 | 0.000 | 0.010 | 0.000 |
| PATIENT_ID | 1.000 | 0.031 | 1.000 | 1.000 | 0.331 | 0.272 | 0.374 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| STATE | 0.110 | 0.192 | 0.079 | 0.141 | 0.050 | 0.000 | 0.048 | 0.000 | 1.000 | 1.000 | 0.134 | 0.108 |
| TARGET | 0.036 | 0.092 | 0.181 | 0.079 | 0.127 | 0.012 | 0.241 | 0.010 | 1.000 | 0.134 | 1.000 | 0.203 |
| TXN_LOCATION_TYPE | 0.170 | 0.150 | 0.321 | 0.269 | 0.084 | 0.069 | 0.120 | 0.000 | 1.000 | 0.108 | 0.203 | 1.000 |
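In the matrix above, PATIENT_ID shows 1.000 against every categorical column. One plausible reading (an assumption, not something the report states) is that this is a by-product of the identifier being unique per row rather than a real relationship: an association measure such as Cramér's V reaches its maximum whenever one variable has a distinct value for each observation. The sketch below demonstrates this with toy data and an uncorrected Cramér's V; whether the profiler used this exact statistic for these pairs is also an assumption.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    cell_counts = Counter(zip(xs, ys))
    # Pearson chi-square via the identity
    # chi2 = n * (sum_ij O_ij^2 / (row_i * col_j) - 1).
    s = sum(
        o * o / (row_totals[x] * col_totals[y])
        for (x, y), o in cell_counts.items()
    )
    chi2 = n * (s - 1)
    k = min(len(row_totals), len(col_totals)) - 1
    if k <= 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(max(chi2, 0.0) / (n * k))


# Toy data: a unique ID per row versus an arbitrary label.
ids = [1, 2, 3, 4, 5, 6]
labels = ["a", "b", "a", "c", "b", "a"]
v_unique_id = cramers_v(ids, labels)

# Two independent binary variables for comparison.
v_independent = cramers_v(["x", "x", "y", "y"], ["a", "b", "a", "b"])
print(v_unique_id, v_independent)
```

If this reading is right, the High correlation alerts that involve PATIENT_ID can be disregarded, and the identifier should be excluded before any modelling.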
Missing values
Sample
| | PATIENT_ID | PATIENT_GENDER | NUM_CONDITIONS | NUM_CONTRAINDICATIONS | TXN_DT | HCP_ID | TXN_LOCATION_TYPE | INSURANCE_TYPE | PATIENT_AGE_DIAGNOSED | STATE | HCP_SPECIALTY | HCP_GENDER | TARGET |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | M-Male | 1 | 0 | 2022-06-11 | 24633 | EMERGENCY ROOM - HOSPITAL | COMMERCIAL | 34 | TX | FAMILY MEDICINE | M-Male | 0 |
| 1 | 2 | M-Male | 1 | 0 | 2022-06-22 | 7777 | EMERGENCY ROOM - HOSPITAL | COMMERCIAL | 2 | PA | EMERGENCY MEDICINE | M-Male | 0 |
| 2 | 3 | M-Male | 1 | 0 | 2022-06-20 | 17051 | OFFICE | COMMERCIAL | 49 | MS | EMERGENCY MEDICINE | F-Female | 0 |
| 3 | 4 | M-Male | 1 | 0 | 2022-06-30 | 19478 | EMERGENCY ROOM - HOSPITAL | COMMERCIAL | 0 | PA | PEDIATRICS | F-Female | 0 |
| 4 | 7 | M-Male | 1 | 0 | 2022-06-06 | 8189 | INPATIENT HOSPITAL | COMMERCIAL | 1 | CA | PEDIATRICS | F-Female | 0 |
| 5 | 8 | M-Male | 1 | 0 | 2022-06-28 | 21499 | EMERGENCY ROOM - HOSPITAL | COMMERCIAL | 45 | WV | DIAGNOSTIC RADIOLOGY | M-Male | 0 |
| 6 | 9 | M-Male | 2 | 0 | 2022-06-30 | 841 | HOSPITAL OUTPATIENT | COMMERCIAL | 31 | ME | EMERGENCY MEDICINE | U-Unknown | 0 |
| 7 | 10 | F-Female | 1 | 0 | 2022-06-01 | 12379 | EMERGENCY ROOM - HOSPITAL | COMMERCIAL | 74 | FL | NURSE PRACTITIONER | F-Female | 0 |
| 8 | 11 | M-Male | 2 | 0 | 2022-06-23 | 11336 | HOSPITAL OUTPATIENT | MEDICARE | 32 | FL | EMERGENCY MEDICINE | M-Male | 0 |
| 9 | 12 | M-Male | 1 | 0 | 2022-06-22 | 10655 | EMERGENCY ROOM - HOSPITAL | COMMERCIAL | 2 | AL | PEDIATRIC EMERGENCY MEDICINE (PEDIATRICS) | M-Male | 0 |
| | PATIENT_ID | PATIENT_GENDER | NUM_CONDITIONS | NUM_CONTRAINDICATIONS | TXN_DT | HCP_ID | TXN_LOCATION_TYPE | INSURANCE_TYPE | PATIENT_AGE_DIAGNOSED | STATE | HCP_SPECIALTY | HCP_GENDER | TARGET |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3397 | 4007 | M-Male | 3 | 0 | 2022-06-12 | 12091 | URGENT CARE FACILITY | COMMERCIAL | 25 | WA | EMERGENCY MEDICINE | M-Male | 0 |
| 3398 | 4008 | F-Female | 17 | 19 | 2022-06-24 | 12194 | TELEHEALTH PROVIDED OTHER THAN IN PATIENT'S HOME | COMMERCIAL | 59 | CA | EMERGENCY MEDICINE | M-Male | 1 |
| 3399 | 4009 | F-Female | 12 | 0 | 2022-06-14 | 16999 | TELEHEALTH PROVIDED IN PATIENT'S HOME | MEDICARE | 72 | CA | FAMILY MEDICINE | M-Male | 1 |
| 3400 | 4010 | M-Male | 9 | 61 | 2022-06-06 | 20038 | TELEHEALTH PROVIDED IN PATIENT'S HOME | MEDICARE | 74 | CA | INTERNAL MEDICINE | M-Male | 1 |
| 3401 | 4012 | F-Female | 4 | 0 | 2022-06-29 | 3491 | OFFICE | MEDICARE | 73 | IL | NURSE PRACTITIONER | F-Female | 1 |
| 3402 | 4015 | M-Male | 22 | 0 | 2022-06-20 | 5692 | OTHER PLACE OF SERVICE | COMMERCIAL | 72 | CA | NURSE PRACTITIONER | U-Unknown | 1 |
| 3403 | 4016 | F-Female | 71 | 6 | 2022-06-03 | 15294 | OFFICE | MEDICARE | 75 | NJ | INTERNAL MEDICINE | F-Female | 0 |
| 3404 | 4017 | F-Female | 64 | 54 | 2022-06-21 | 11575 | OFF CAMPUS-OUTPATIENT HOSPITAL | MEDICARE | 85 | IL | FAMILY MEDICINE | F-Female | 1 |
| 3405 | 4019 | M-Male | 15 | 39 | 2022-06-07 | 5402 | EMERGENCY ROOM - HOSPITAL | COMMERCIAL | 64 | TX | EMERGENCY MEDICINE | F-Female | 1 |
| 3406 | 4020 | F-Female | 29 | 1 | 2022-06-27 | 17966 | OFFICE | COMMERCIAL | 81 | WI | FAMILY MEDICINE | M-Male | 1 |